The evolution of auditory contrast
This paper reconciles the standpoint that language users do not aim at improving their sound systems with the observation that languages seem to improve their sound systems. Computer simulations of inventories of sibilants show that Optimality-Theoretic learners who optimize their perception grammars automatically introduce a so-called prototype effect, i.e. the phenomenon that the learner’s preferred auditory realization of a certain phonological category is more peripheral than the average auditory realization of this category in her language environment. In production, however, this prototype effect is counteracted by an articulatory effect that limits the auditory form to something that is not too difficult to pronounce. If the prototype effect and the articulatory effect are of different sizes, the learner must end up with an auditorily different sound system from that of her language environment. The computer simulations show that, independently of the initial auditory sound system, a stable equilibrium is reached within a small number of generations. In this stable state, the dispersion of the sibilants of the language strikes an optimal balance between articulatory ease and auditory contrast. The important point is that this is derived within a model without any goal-oriented elements such as dispersion constraints.
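The generational dynamics described above can be illustrated with a toy one-dimensional iterated-learning sketch: each generation shifts its preferred realizations outward (prototype effect) and is then pulled back toward an articulatorily neutral region (articulatory effect). The update rule, the `proto` and `ease` parameters, and the [0, 1] auditory scale are illustrative assumptions of this sketch, not the paper's actual Optimality-Theoretic simulations.

```python
def step(x1, x2, proto=0.05, ease=0.1):
    """One generation of transmission for two categories on an
    auditory scale [0, 1], assuming x1 < x2 (hypothetical parameters).

    proto: prototype effect, pushing each category away from the other.
    ease:  articulatory effect, pulling each category toward neutral 0.5.
    """
    # Prototype effect: preferred realizations are more peripheral
    # than the input the learner heard.
    y1, y2 = x1 - proto, x2 + proto
    # Articulatory effect: production drifts toward the easy region.
    y1 += ease * (0.5 - y1)
    y2 += ease * (0.5 - y2)
    return y1, y2

# Two very different initial sound systems...
a, b = (0.4, 0.6), (0.1, 0.9)
for _ in range(300):
    a, b = step(*a), step(*b)
# ...converge on the same equilibrium, where the outward prototype
# push and the inward articulatory pull exactly balance.
```

In this toy model the equilibrium is independent of the initial system, mirroring the paper's finding that dispersion emerges without any goal-oriented dispersion constraint.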
The violability of backness in retroflex consonants
This paper addresses remarks made by Flemming (2003) to the effect that his analysis of the interaction between retroflexion and vowel backness is superior to that of Hamann (2003b). While Hamann maintained that retroflex articulations are always back, Flemming adduces phonological as well as phonetic evidence to prove that retroflex consonants can be non-back and even front (i.e. palatalised). The present paper, however, shows that the phonetic evidence fails under closer scrutiny. A closer consideration of the phonological evidence shows, by making a principled distinction between articulatory and perceptual drives, that a reanalysis of Flemming’s data in terms of unviolated retroflex backness is not only possible but also simpler with respect to the number of language-specific stipulations.
Loanword adaptation as first-language phonological perception
We show that loanword adaptation can be understood entirely in terms of phonological and phonetic comprehension and production mechanisms in the first language. We provide explicit accounts of several loanword adaptation phenomena (in Korean) in terms of an Optimality-Theoretic grammar model with the same three levels of representation that are needed to describe L1 phonology: the underlying form, the phonological surface form, and the auditory-phonetic form. The model is bidirectional, i.e., the same constraints and rankings are used by the listener and by the speaker. These constraints and rankings are the same for L1 processing and loanword adaptation.
Modelling the formation of phonotactic restrictions across the mental lexicon
Experimental data show that adult learners of an artificial language with a phonotactic restriction learned this restriction better when trained on word types (e.g. when they were presented with 80 different words twice each) than when trained on word tokens (e.g. when presented with 40 different words four times each) (Hamann & Ernestus submitted). These findings support Pierrehumbert’s (2003) observation that phonotactic co-occurrence restrictions are formed across lexical entries, since only lexical levels of representation can be sensitive to type frequencies.
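The type/token distinction above can be made concrete with a small sketch: the same phonotactic pattern can have different frequencies depending on whether each word counts once (type frequency) or is weighted by how often it occurs (token frequency). The mini-lexicon and the word-initial /p/ pattern are invented for illustration and are not the study's actual artificial language.

```python
# Hypothetical mini-lexicon: (word, token_count) pairs.
lexicon = [("pata", 6), ("kipa", 1), ("tupo", 1), ("piko", 2)]

def p_initial_rate(lexicon, by="type"):
    """Share of the lexicon whose words begin with /p/, computed over
    types (each word counts once) or tokens (weighted by frequency)."""
    if by == "type":
        total = len(lexicon)
        hits = sum(1 for word, _ in lexicon if word.startswith("p"))
    else:
        total = sum(count for _, count in lexicon)
        hits = sum(count for word, count in lexicon if word.startswith("p"))
    return hits / total

print(p_initial_rate(lexicon, by="type"))   # 2 of 4 word types -> 0.5
print(p_initial_rate(lexicon, by="token"))  # 8 of 10 tokens -> 0.8
```

A learner generalizing across lexical entries tracks the first quantity; a learner counting raw exposures tracks the second, which is why the type-frequency advantage points to the lexicon as the locus of phonotactic learning.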
Analyzing Input and Output Representations for Speech-Driven Gesture Generation
This paper presents a novel framework for automatic speech-driven gesture generation, applicable to human-agent interaction including both virtual agents and robots. Specifically, we extend recent deep-learning-based, data-driven methods for speech-driven gesture generation by incorporating representation learning. Our model takes speech as input and produces gestures as output, in the form of a sequence of 3D coordinates. Our approach consists of two steps. First, we learn a lower-dimensional representation of human motion using a denoising autoencoder neural network, consisting of a motion encoder MotionE and a motion decoder MotionD. The learned representation preserves the most important aspects of the human pose variation while removing less relevant variation. Second, we train a novel encoder network SpeechE to map from speech to a corresponding motion representation with reduced dimensionality. At test time, the speech encoder and the motion decoder networks are combined: SpeechE predicts motion representations based on a given speech signal and MotionD then decodes these representations to produce motion sequences. We evaluate different representation sizes in order to find the most effective dimensionality for the representation. We also evaluate the effects of using different speech features as input to the model. We find that mel-frequency cepstral coefficients (MFCCs), alone or combined with prosodic features, perform the best. The results of a subsequent user study confirm the benefits of the representation learning.
Comment: Accepted at IVA '19. Shorter version published at AAMAS '19. The code is available at https://github.com/GestureGeneration/Speech_driven_gesture_generation_with_autoencode
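The two-step pipeline described above can be sketched minimally in NumPy. The real MotionE, MotionD, and SpeechE are trained neural networks; here they are stand-in linear maps with assumed dimensionalities, so only the wiring of the test-time chain (SpeechE followed by MotionD) is shown, not the learning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: 15 joints x 3D coordinates, an 8-dim learned motion
# representation, and 26-dim speech features (e.g. MFCC-like vectors).
POSE_DIM, REPR_DIM, SPEECH_DIM = 45, 8, 26

# Step 1: a (pretend-)trained denoising autoencoder for motion.
W_enc = rng.standard_normal((REPR_DIM, POSE_DIM))  # stand-in for MotionE
W_dec = rng.standard_normal((POSE_DIM, REPR_DIM))  # stand-in for MotionD

def motion_e(pose):  # pose frame -> compact motion representation
    return W_enc @ pose

def motion_d(z):     # motion representation -> reconstructed pose frame
    return W_dec @ z

# Step 2: SpeechE maps speech features into the same representation space.
W_speech = rng.standard_normal((REPR_DIM, SPEECH_DIM))  # stand-in for SpeechE

def speech_e(features):
    return W_speech @ features

# Test time: chain SpeechE -> MotionD to turn speech frames into poses.
def generate_gesture(speech_frames):
    return [motion_d(speech_e(f)) for f in speech_frames]

speech = [rng.standard_normal(SPEECH_DIM) for _ in range(10)]
poses = generate_gesture(speech)  # 10 frames of 45 coordinates each
```

The design point the sketch makes explicit is that MotionE is only needed during training (to define the representation space); at inference the system never sees poses, only speech.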
Unattended distributional training can shift phoneme boundaries
Listeners are sensitive to speech sounds' probability distributions. Distributional training (DT) studies with adults typically involve conscious activation of phoneme labels. We show that distributional exposure can shift existing phoneme boundaries (Spanish /e/-/i/) pre-attentively. Using a DT paradigm involving two bimodal distributions, we assessed listeners' neural discrimination across three sounds, showing greater pre-to-post-test improvement for the two adjacent sounds that fell into different clusters of the trained distribution than for those that fell into one cluster. Upon unattended exposure to an intricate stimulus set, listeners thus relocate native phoneme boundaries. We assessed whether the paradigm also works for category creation (Spanish listeners establishing a duration contrast), where it has methodological advantages over the usual unimodal-versus-bimodal paradigm. DT yielded a greater effect for the /e/-/i/ boundary shift than for duration contrast creation. It seems that second-language phoneme contrasts similar to native ones might be easier to acquire than new contrasts.
The Effects of Silver Nanoparticles on Soybean (Glycine max) Growth and Nodulation
Due to their antimicrobial properties, silver nanoparticles (AgNPs) have become more popular in consumer and industrial products, leading to increasing agricultural and environmental concentrations. Exposure to AgNPs could be detrimental to plants, microbes, and their symbiotic relationships. When subjected to 10 µg/mL AgNPs in a 96-well plate, growth of Bradyrhizobium japonicum USDA 110 was halted. In hydroponic culture with 2.5 µg/mL AgNPs, biomass of inoculated Glycine max (L.) Merr. was 50% of that of controls. Axenic plants were unaffected by this dose, but growth was inhibited at higher doses, indicating that AgNPs inhibit both nodulation and growth. Nodules treated with 2.5 µg/mL AgNPs lacked bacteroids, and plants given 0.5-2.5 µg/mL AgNPs showed a 40-65% decrease in nitrogen fixation. In conclusion, we determined that AgNPs interfere not only with plant-microbe relations but also with general plant and bacterial growth. As a consequence, we should be mindful of not releasing AgNPs into the environment and agricultural land.